Overview

Dataset statistics

Number of variables: 15
Number of observations: 2000
Missing cells: 1892
Missing cells (%): 6.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 234.5 KiB
Average record size in memory: 120.1 B
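
The derived figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over variables × observations and that 1 KiB = 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics and the "Missing" alerts.
n_variables = 15
n_observations = 2000
total_size_kib = 234.5
per_column_missing = {
    "Genetic_Pedigree_Coefficient": 92,
    "Pregnancy": 1558,
    "alcohol_consumption_per_day": 242,
}

# Total missing cells is the sum of the per-column missing counts.
missing_cells = sum(per_column_missing.values())

# Share of missing cells among all cells in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Average bytes per record (assumes 1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_cells} ({missing_pct:.1f}%)")
print(f"Average record size: {avg_record_bytes:.1f} B")
```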

Variable types

Numeric: 8
Categorical: 7

Alerts

Blood_Pressure_Abnormality is highly correlated with Chronic_kidney_disease and 1 other field [High correlation]
Chronic_kidney_disease is highly correlated with Blood_Pressure_Abnormality [High correlation]
Adrenal_and_thyroid_disorders is highly correlated with Blood_Pressure_Abnormality [High correlation]
Pregnancy is highly correlated with Sex [High correlation]
Sex is highly correlated with Pregnancy [High correlation]
Blood_Pressure_Abnormality is highly correlated with Adrenal_and_thyroid_disorders and 1 other field [High correlation]
Blood_Pressure_Abnormality is highly correlated with Level_of_Hemoglobin and 3 other fields [High correlation]
Level_of_Hemoglobin is highly correlated with Blood_Pressure_Abnormality and 1 other field [High correlation]
Genetic_Pedigree_Coefficient is highly correlated with Blood_Pressure_Abnormality [High correlation]
Sex is highly correlated with Level_of_Hemoglobin [High correlation]
Chronic_kidney_disease is highly correlated with Blood_Pressure_Abnormality and 1 other field [High correlation]
Adrenal_and_thyroid_disorders is highly correlated with Blood_Pressure_Abnormality and 1 other field [High correlation]
Genetic_Pedigree_Coefficient has 92 (4.6%) missing values [Missing]
Pregnancy has 1558 (77.9%) missing values [Missing]
alcohol_consumption_per_day has 242 (12.1%) missing values [Missing]
Patient_Number is uniformly distributed [Uniform]
Patient_Number has unique values [Unique]

Reproduction

Analysis started: 2022-04-29 11:49:12.219861
Analysis finished: 2022-04-29 11:49:26.061630
Duration: 13.84 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

Patient_Number
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 2000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1000.5
Minimum: 1
Maximum: 2000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 100.95
Q1: 500.75
Median: 1000.5
Q3: 1500.25
95-th percentile: 1900.05
Maximum: 2000
Range: 1999
Interquartile range (IQR): 999.5

Descriptive statistics

Standard deviation: 577.4945887
Coefficient of variation (CV): 0.5772059857
Kurtosis: -1.2
Mean: 1000.5
Median Absolute Deviation (MAD): 500
Skewness: 0
Sum: 2001000
Variance: 333500
Monotonicity: Strictly increasing

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                Count  Frequency (%)
1                    1      0.1%
1330                 1      0.1%
1343                 1      0.1%
1342                 1      0.1%
1341                 1      0.1%
1340                 1      0.1%
1339                 1      0.1%
1338                 1      0.1%
1337                 1      0.1%
1336                 1      0.1%
Other values (1990)  1990   99.5%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
2000   1      0.1%
1999   1      0.1%
1998   1      0.1%
1997   1      0.1%
1996   1      0.1%
1995   1      0.1%
1994   1      0.1%
1993   1      0.1%
1992   1      0.1%
1991   1      0.1%
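
The descriptive statistics reported for Patient_Number can be reproduced by hand. A minimal sketch, assuming the column holds exactly the integers 1 to 2000 (the UNIQUE flag, the strictly increasing monotonicity, and the minimum and maximum suggest this, but the report does not state it outright) and that the variance uses the n − 1 denominator:

```python
import statistics

# Assumption: Patient_Number is the sequence 1, 2, ..., 2000.
values = list(range(1, 2001))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation
median = statistics.median(values)

# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median(abs(v - median) for v in values)

print(mean, variance, std, cv, mad)
```

Under that assumption the results agree with the reported mean (1000.5), variance (333500), standard deviation, CV, and MAD (500).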

Blood_Pressure_Abnormality
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Value  Count
0      1013
1      987

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      1013   50.6%
1      987    49.4%

[Figure: Histogram of lengths of the category]

[Figure: Pie chart]

Level_of_Hemoglobin
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 757
Distinct (%): 37.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.710035
Minimum: 8.1
Maximum: 17.56
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 8.1
5-th percentile: 8.58
Q1: 10.1475
Median: 11.33
Q3: 12.945
95-th percentile: 16.01
Maximum: 17.56
Range: 9.46
Interquartile range (IQR): 2.7975

Descriptive statistics

Standard deviation: 2.186700638
Coefficient of variation (CV): 0.1867373272
Kurtosis: -0.1842879759
Mean: 11.710035
Median Absolute Deviation (MAD): 1.36
Skewness: 0.6570660942
Sum: 23420.07
Variance: 4.781659679
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value               Count  Frequency (%)
12.07               11     0.5%
11.58               10     0.5%
11.54               9      0.4%
11.95               8      0.4%
11.19               8      0.4%
10.55               8      0.4%
10.38               8      0.4%
11.16               8      0.4%
10.89               8      0.4%
10.98               8      0.4%
Other values (747)  1914   95.7%

Minimum 10 values

Value  Count  Frequency (%)
8.1    2      0.1%
8.12   1      0.1%
8.13   4      0.2%
8.15   2      0.1%
8.16   2      0.1%
8.17   4      0.2%
8.18   3      0.1%
8.19   1      0.1%
8.2    2      0.1%
8.21   1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
17.56  1      0.1%
17.54  1      0.1%
17.53  1      0.1%
17.52  2      0.1%
17.51  1      0.1%
17.48  1      0.1%
17.45  1      0.1%
17.44  3      0.1%
17.39  1      0.1%
17.35  1      0.1%

Genetic_Pedigree_Coefficient
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 101
Distinct (%): 5.3%
Missing: 92
Missing (%): 4.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4948165618
Minimum: 0
Maximum: 1
Zeros: 17
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.04
Q1: 0.24
Median: 0.49
Q3: 0.74
95-th percentile: 0.9565
Maximum: 1
Range: 1
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.2917358818
Coefficient of variation (CV): 0.5895839071
Kurtosis: -1.17856276
Mean: 0.4948165618
Median Absolute Deviation (MAD): 0.25
Skewness: 0.01517745777
Sum: 944.11
Variance: 0.08510982475
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value              Count  Frequency (%)
0.86               32     1.6%
0.13               30     1.5%
0.63               28     1.4%
0.56               27     1.4%
0.17               27     1.4%
0.99               26     1.3%
0.95               25     1.2%
0.06               25     1.2%
0.25               25     1.2%
0.46               25     1.2%
Other values (91)  1638   81.9%
(Missing)          92     4.6%

Minimum 10 values

Value  Count  Frequency (%)
0      17     0.9%
0.01   23     1.1%
0.02   24     1.2%
0.03   17     0.9%
0.04   23     1.1%
0.05   15     0.8%
0.06   25     1.2%
0.07   11     0.5%
0.08   21     1.1%
0.09   21     1.1%

Maximum 10 values

Value  Count  Frequency (%)
1      18     0.9%
0.99   26     1.3%
0.98   19     0.9%
0.97   18     0.9%
0.96   15     0.8%
0.95   25     1.2%
0.94   18     0.9%
0.93   13     0.7%
0.92   21     1.1%
0.91   11     0.5%

Age
Real number (ℝ≥0)

Distinct: 58
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.5585
Minimum: 18
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 18
5-th percentile: 20
Q1: 32
Median: 46
Q3: 62
95-th percentile: 73
Maximum: 75
Range: 57
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.10783203
Coefficient of variation (CV): 0.3674480928
Kurtosis: -1.248231524
Mean: 46.5585
Median Absolute Deviation (MAD): 15
Skewness: 0.02117832032
Sum: 93117
Variance: 292.6779167
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value              Count  Frequency (%)
18                 46     2.3%
72                 45     2.2%
71                 43     2.1%
21                 43     2.1%
69                 41     2.1%
25                 41     2.1%
53                 41     2.1%
29                 40     2.0%
49                 40     2.0%
39                 40     2.0%
Other values (48)  1580   79.0%

Minimum 10 values

Value  Count  Frequency (%)
18     46     2.3%
19     27     1.4%
20     29     1.5%
21     43     2.1%
22     34     1.7%
23     27     1.4%
24     35     1.8%
25     41     2.1%
26     34     1.7%
27     37     1.8%

Maximum 10 values

Value  Count  Frequency (%)
75     36     1.8%
74     39     1.9%
73     37     1.8%
72     45     2.2%
71     43     2.1%
70     32     1.6%
69     41     2.1%
68     38     1.9%
67     32     1.6%
66     34     1.7%

BMI
Real number (ℝ≥0)

Distinct: 41
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.0815
Minimum: 10
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 10
5-th percentile: 11
Q1: 20
Median: 30
Q3: 40
95-th percentile: 48
Maximum: 50
Range: 40
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 11.7612083
Coefficient of variation (CV): 0.3909781196
Kurtosis: -1.182620943
Mean: 30.0815
Median Absolute Deviation (MAD): 10
Skewness: -0.0175554741
Sum: 60163
Variance: 138.3260208
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=41)]
Common values

Value              Count  Frequency (%)
11                 62     3.1%
38                 62     3.1%
34                 59     2.9%
26                 59     2.9%
41                 57     2.9%
21                 57     2.9%
20                 54     2.7%
35                 53     2.6%
40                 53     2.6%
29                 52     2.6%
Other values (31)  1432   71.6%

Minimum 10 values

Value  Count  Frequency (%)
10     48     2.4%
11     62     3.1%
12     42     2.1%
13     38     1.9%
14     44     2.2%
15     52     2.6%
16     47     2.4%
17     39     1.9%
18     49     2.5%
19     45     2.2%

Maximum 10 values

Value  Count  Frequency (%)
50     50     2.5%
49     46     2.3%
48     45     2.2%
47     49     2.5%
46     51     2.5%
45     45     2.2%
44     41     2.1%
43     52     2.6%
42     50     2.5%
41     57     2.9%

Sex
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Value  Count
0      1008
1      992

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1008   50.4%
1      992    49.6%

[Figure: Histogram of lengths of the category]

[Figure: Pie chart]

Pregnancy
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): 0.5%
Missing: 1558
Missing (%): 77.9%
Memory size: 15.8 KiB

Value  Count
0.0    243
1.0    199

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value      Count  Frequency (%)
0.0        243    12.2%
1.0        199    10.0%
(Missing)  1558   77.9%

[Figure: Histogram of lengths of the category]

[Figure: Pie chart]

Value  Count  Frequency (%)
0.0    243    55.0%
1.0    199    45.0%
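
The two Pregnancy tables show different percentages for the same counts. That is consistent with the first table dividing by all 2000 rows and the pie-chart table dividing only by the non-missing rows. A small sketch of that reading, which is an interpretation of the numbers rather than something the report states:

```python
# Counts copied from the Pregnancy section of the report.
counts = {"0.0": 243, "1.0": 199}
n_rows = 2000
n_missing = 1558

n_present = sum(counts.values())

# Percentages relative to every row (missing rows included).
share_of_all = {k: 100 * v / n_rows for k, v in counts.items()}

# Percentages relative to the non-missing rows only.
share_of_present = {k: 100 * v / n_present for k, v in counts.items()}

for key in counts:
    print(key, f"{share_of_all[key]:.2f}% of all rows,",
          f"{share_of_present[key]:.2f}% of non-missing rows")
```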

Smoking
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Value  Count
1      1019
0      981

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
1      1019   50.9%
0      981    49.0%

[Figure: Histogram of lengths of the category]

[Figure: Pie chart]

Physical_activity
Real number (ℝ≥0)

Distinct: 1951
Distinct (%): 97.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25254.4245
Minimum: 628
Maximum: 49980
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 628
5-th percentile: 3141.75
Q1: 13605.75
Median: 25353
Q3: 37382.25
95-th percentile: 47170.2
Maximum: 49980
Range: 49352
Interquartile range (IQR): 23776.5

Descriptive statistics

Standard deviation: 14015.43962
Coefficient of variation (CV): 0.5549696697
Kurtosis: -1.161726861
Mean: 25254.4245
Median Absolute Deviation (MAD): 11893.5
Skewness: -0.01055936725
Sum: 50508849
Variance: 196432547.8
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                Count  Frequency (%)
29513                2      0.1%
11432                2      0.1%
26274                2      0.1%
13144                2      0.1%
36769                2      0.1%
38769                2      0.1%
37260                2      0.1%
40772                2      0.1%
40158                2      0.1%
41415                2      0.1%
Other values (1941)  1980   99.0%

Minimum 10 values

Value  Count  Frequency (%)
628    1      0.1%
745    1      0.1%
768    2      0.1%
774    1      0.1%
784    1      0.1%
791    1      0.1%
799    1      0.1%
814    1      0.1%
829    1      0.1%
847    1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
49980  1      0.1%
49940  1      0.1%
49926  1      0.1%
49915  1      0.1%
49806  1      0.1%
49783  1      0.1%
49759  1      0.1%
49682  1      0.1%
49671  1      0.1%
49665  1      0.1%

salt_content_in_the_diet
Real number (ℝ≥0)

Distinct: 1945
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24926.097
Minimum: 22
Maximum: 49976
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 22
5-th percentile: 2462.1
Q1: 13151.75
Median: 25046.5
Q3: 36839.75
95-th percentile: 47202.25
Maximum: 49976
Range: 49954
Interquartile range (IQR): 23688

Descriptive statistics

Standard deviation: 14211.69259
Coefficient of variation (CV): 0.5701531446
Kurtosis: -1.154963837
Mean: 24926.097
Median Absolute Deviation (MAD): 11813
Skewness: -0.02179784832
Sum: 49852194
Variance: 201972206.2
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                Count  Frequency (%)
24961                2      0.1%
23332                2      0.1%
46834                2      0.1%
1769                 2      0.1%
28517                2      0.1%
28465                2      0.1%
14561                2      0.1%
16005                2      0.1%
26078                2      0.1%
23468                2      0.1%
Other values (1935)  1980   99.0%

Minimum 10 values

Value  Count  Frequency (%)
22     1      0.1%
44     1      0.1%
58     1      0.1%
62     1      0.1%
66     1      0.1%
105    1      0.1%
144    1      0.1%
150    1      0.1%
154    1      0.1%
161    1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
49976  1      0.1%
49956  1      0.1%
49846  1      0.1%
49800  1      0.1%
49778  1      0.1%
49710  1      0.1%
49700  1      0.1%
49644  1      0.1%
49642  1      0.1%
49626  1      0.1%

alcohol_consumption_per_day
Real number (ℝ≥0)

MISSING

Distinct: 488
Distinct (%): 27.8%
Missing: 242
Missing (%): 12.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 251.0085324
Minimum: 0
Maximum: 499
Zeros: 9
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 28.85
Q1: 126.25
Median: 250
Q3: 377.75
95-th percentile: 473.15
Maximum: 499
Range: 499
Interquartile range (IQR): 251.5

Descriptive statistics

Standard deviation: 143.6518844
Coefficient of variation (CV): 0.5722988101
Kurtosis: -1.217678643
Mean: 251.0085324
Median Absolute Deviation (MAD): 126
Skewness: -0.008259128943
Sum: 441273
Variance: 20635.8639
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value               Count  Frequency (%)
253                 11     0.5%
144                 10     0.5%
401                 10     0.5%
302                 10     0.5%
347                 9      0.4%
0                   9      0.4%
485                 9      0.4%
206                 8      0.4%
81                  8      0.4%
180                 8      0.4%
Other values (478)  1666   83.3%
(Missing)           242    12.1%

Minimum 10 values

Value  Count  Frequency (%)
0      9      0.4%
1      3      0.1%
2      3      0.1%
3      5      0.2%
4      2      0.1%
5      4      0.2%
6      5      0.2%
8      4      0.2%
9      3      0.1%
11     4      0.2%

Maximum 10 values

Value  Count  Frequency (%)
499    2      0.1%
497    1      0.1%
496    3      0.1%
495    5      0.2%
494    4      0.2%
493    1      0.1%
492    3      0.1%
491    4      0.2%
490    3      0.1%
488    4      0.2%

Level_of_Stress
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Value  Count
3      691
1      666
2      643

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 2
4th row: 1
5th row: 2

Common Values

Value  Count  Frequency (%)
3      691    34.5%
1      666    33.3%
2      643    32.1%

[Figure: Histogram of lengths of the category]

[Figure: Pie chart]

Chronic_kidney_disease
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Value  Count
0      1287
1      713

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1287   64.3%
1      713    35.6%

[Figure: Histogram of lengths of the category]

[Figure: Pie chart]

Adrenal_and_thyroid_disorders
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB

Value  Count
0      1404
1      596

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1404   70.2%
1      596    29.8%

[Figure: Histogram of lengths of the category]

[Figure: Pie chart]

Interactions

[Figure: 64 pairwise interaction plots]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
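
The calculation described above can be sketched in plain Python. This is an illustration only: the helper names are hypothetical, tied values receive the average of their rank positions (an assumption, since the report does not describe its tie handling), and the report's own backend may differ in detail.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)  # undefined (division by zero) for constant input


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship still gives rho close to 1.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
print(spearman_rho(x, y))
```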

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
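
As a rough illustration of that formula and of the invariance property, here is a plain-Python sketch. The function name and sample data are made up for the example and are not taken from the dataset.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    # The 1/(n - 1) factors of covariance and variances cancel, so plain sums suffice.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)  # undefined (division by zero) for constant input


# Illustrative data, roughly linear with some noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]

r = pearson_r(x, y)

# Shifting and rescaling each variable by a positive factor leaves r unchanged.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(r, r_transformed)
```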

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
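
The formula as stated corresponds to the variant usually called tau-a. A minimal sketch, assuming tied pairs count as neither concordant nor discordant; libraries often apply a tie-adjusted variant instead, so values in the report may differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
        # direction == 0 means a tie in x or y: counted as neither.
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Six pairs in total: five concordant, one discordant (the swapped 3 and 2).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```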

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
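
For orientation, the bias correction can be sketched as follows. This is one reading of Bergsma's correction written in plain Python; the function name is hypothetical, no continuity correction is applied to the chi-squared statistic, and the report's own implementation may differ in such details.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrected phi-squared and corrected table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Two identical binary variables are perfectly associated.
perfect = cramers_v_corrected([0, 1] * 50, [0, 1] * 50)
# A balanced 2x2 design with equal cell counts shows no association.
independent = cramers_v_corrected([0, 0, 1, 1] * 25, [0, 1, 0, 1] * 25)
print(perfect, independent)
```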

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Missing values

A simple visualization of nullity by column.

The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.

The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
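
One common way to compute a nullity correlation is to turn each column into a present/absent indicator and take the Pearson correlation of the indicators. The sketch below assumes that approach and represents missing values as `None`; the function name and the toy columns are invented for illustration and are not rows from this dataset.

```python
from statistics import mean


def nullity_correlation(col_a, col_b):
    """Pearson correlation of the two columns' presence indicators.

    Returns None when either column is fully present or fully missing,
    because the correlation is undefined for a constant indicator.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / (var_a * var_b) ** 0.5


# Toy columns (hypothetical values).
col_x = [None, 1.0, None, 0.0, 1.0, None]
col_y = [5.0, None, 12.0, 40.0, None, 7.0]
col_full = [0, 1, 0, 1, 1, 0]

print(nullity_correlation(col_x, col_y))     # negative: one tends to be present when the other is missing
print(nullity_correlation(col_x, col_full))  # None: col_full has no missing values
```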

Sample

First rows

Patient_NumberBlood_Pressure_AbnormalityLevel_of_HemoglobinGenetic_Pedigree_CoefficientAgeBMISexPregnancySmokingPhysical_activitysalt_content_in_the_dietalcohol_consumption_per_dayLevel_of_StressChronic_kidney_diseaseAdrenal_and_thyroid_disorders
01111.280.90342311.004596148071NaN211
1209.750.2354331NaN02610625333205.0300
23110.790.9170490NaN099952946567.0210
34011.000.4371500NaN0106357439242.0100
45114.170.8352190NaN01561949644397.0200
56011.640.5423480NaN1270427513NaN300
67111.690.75434111.003836932967206.0311
78012.700.4148200NaN02978126749134.0200
89010.880.6872440NaN0814960799.0300
910114.560.6140440NaN012781271595.0200

Last rows

Patient_NumberBlood_Pressure_AbnormalityLevel_of_HemoglobinGenetic_Pedigree_CoefficientAgeBMISexPregnancySmokingPhysical_activitysalt_content_in_the_dietalcohol_consumption_per_dayLevel_of_StressChronic_kidney_diseaseAdrenal_and_thyroid_disorders
19901991111.210.0163250NaN132903454050.0300
19911992115.530.1222240NaN04832516514NaN211
1992199319.380.4960391NaN14659129557125.0111
1993199409.691.0073421NaN1433443623048.0300
19941995011.070.6658311NaN03860322836379.0200
19951996110.140.0269261NaN12611847568144.0310
19961997111.771.00244511.0125728063NaN311
19971998116.910.2218420NaN01493324753NaN211
19981999011.150.7246451NaN11815715275253.0300
19992000111.360.0941450NaN02072930463230.0110